1. INTRODUCTION.


In  this  article,  we  consider  some  very  elementary  but  important  problems 
which  arise  from  modern  uses  of  the  computer  in  statistics,  particularly  in 
connection  with  testing  goodness-of-fit.  These  involve  (a)  estimating 
percentage  points;  (b)  simulating  a  Gaussian  process;  and  (c)  approximating 
the inverse of a covariance matrix of order statistics.


2.  ESTIMATING  PERCENTAGE  POINTS. 


The  first  problem  is  that  of  estimating  percentage  points  of  an 
intractable  distribution.  For  example,  the  distributions  of  many 
goodness-of-fit  statistics  are  very  difficult  to  find,  particularly  for  a 
finite  sample,  and  particularly  if  parameters  are  estimated.  The  standard 
procedure  is  to  simulate  the  situation  considered,  and  calculate  the 
statistic,  say  S;  then  repeat  this  n  times  to  obtain  the  Monte  Carlo 
distribution of S. The p-th percentile, for example, would be estimated by
the ([np]+1)-th order statistic of the S-sample. Some years ago, Schafer (1976)
suggested that it would be better to take, say, c samples of m = n/c Monte
Carlo values, and take the average of the c estimates of the p-th
percentile as the overall estimate; thus if $S^{i}_{(k)}$ denotes the k-th order
statistic of the i-th subsample, the estimate would be given by
$\sum_{i=1}^{c} S^{i}_{(k)}/c$, where k = [mp]+1. Schafer's suggestion was
investigated by two colleagues and myself (Juritz, Juritz, and Stephens,
1983); we showed that the bias in the estimate is less for the estimate from
one full sample than it is for the estimate from the mean of c subsamples,
except in somewhat contrived cases, while the confidence intervals for the
estimates were approximately the same size. This work was subsequently
confirmed by Dudewicz and van der Meulen (1984), who illustrated their work
on a statistic they had introduced for testing uniformity. Zelterman (1987)
also discusses this problem.
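
The comparison between the two estimators can be illustrated with a small experiment. The following is only a minimal sketch, not the procedure used in the studies cited: it assumes that the statistic S is simply a standard exponential variate (so that the true percentile is -log(1-p)), and the function names, the number of replications and the seed are illustrative choices.

```python
import math
import random


def percentile_estimate(sample, p):
    """Return the ([np]+1)-th order statistic of the sample."""
    n = len(sample)
    k = math.floor(n * p + 1e-9) + 1  # 1-based rank; small tolerance guards rounding
    return sorted(sample)[k - 1]


def single_sample_estimate(draw, n, p):
    """Estimate the p-th percentile from one Monte Carlo sample of size n."""
    return percentile_estimate([draw() for _ in range(n)], p)


def multi_sample_estimate(draw, n, c, p):
    """Schafer's suggestion: average the estimates from c subsamples of size n/c."""
    m = n // c
    parts = [percentile_estimate([draw() for _ in range(m)], p) for _ in range(c)]
    return sum(parts) / c


def compare_bias(reps=200, n=1000, c=10, p=0.95, seed=1):
    """Return (bias of single-sample method, bias of multi-sample method)."""
    rng = random.Random(seed)
    draw = lambda: rng.expovariate(1.0)   # assumed stand-in for the statistic S
    true_value = -math.log(1.0 - p)       # exact percentile of the exponential
    single = [single_sample_estimate(draw, n, p) for _ in range(reps)]
    multi = [multi_sample_estimate(draw, n, c, p) for _ in range(reps)]
    return sum(single) / reps - true_value, sum(multi) / reps - true_value


if __name__ == "__main__":
    b_single, b_multi = compare_bias()
    print("bias, one sample of 1000:", round(b_single, 4))
    print("bias, mean of 10 samples of 100:", round(b_multi, 4))
```

With these settings the multi-sample estimate should show a bias close to that of a single sample of size 100, which is the point made in the text.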


Bias in estimates of percentage points.

Of course, for certain distributions, exact values of $m_k$, the expected
values of $X_{(k)}$, will be known, and the bias in the Monte Carlo estimate can
be estimated. Thus, for the normal distribution, for a sample of size 100,
the 95% point will be estimated by $X_{(96)}$, which has expected value 1.6872,
to compare with the true percentile value of 1.6449. In Juritz, Juritz, and
Stephens (1983) we published a table showing the bias for the 95% and 97.5%
points for the standard normal distribution, and for various sample sizes.
Table 1 below is a similar table for the standard exponential distribution
with mean 1. The values of r and s are those for which the interval
$I = (X_{(r)}, X_{(s)})$ is a 95% confidence interval for the true percentile.
These are found to high accuracy as follows. Let $\xi_p$ be the percentile at
level p, let $w = (np(1-p))^{1/2}$, and let $z_\gamma$ denote the $(100\gamma)$-th
percentile of the standard normal distribution. Then the choice
$r = -w z_\gamma + np + \frac{1}{2}$ and $s = w z_\gamma + np + \frac{1}{2}$,
where $\gamma = 1 - \alpha/2$, gives a $(1-\alpha)100\%$ confidence interval
for $\xi_p$.
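
The quantities in Table 1 can be checked numerically. The sketch below rests on two assumptions that are not stated in the text: that r and s are rounded up to the next integer (which reproduces the tabulated ranks), and that the expected value of the k-th order statistic of a standard exponential sample of size n is the standard sum of reciprocals 1/n + 1/(n-1) + ... + 1/(n-k+1). The function names are illustrative.

```python
import math
from statistics import NormalDist


def exp_order_stat_mean(n, k):
    """E[X_(k)] for a standard exponential sample of size n."""
    return sum(1.0 / j for j in range(n - k + 1, n + 1))


def ci_ranks(n, p, alpha=0.05):
    """Ranks (r, s) with r = -w z + np + 1/2 and s = w z + np + 1/2.

    Rounding up to an integer is an assumption made for this sketch.
    s is returned as None when it exceeds n (no upper limit available).
    """
    w = math.sqrt(n * p * (1.0 - p))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    r = math.ceil(-w * z + n * p + 0.5)
    s = math.ceil(w * z + n * p + 0.5)
    return r, (s if s <= n else None)


def table_row(n, p):
    """Return (k, m_k, bias, r, s) for the standard exponential distribution."""
    k = math.floor(n * p + 1e-9) + 1
    m_k = exp_order_stat_mean(n, k)
    true_value = -math.log(1.0 - p)
    r, s = ci_ranks(n, p)
    return k, m_k, m_k - true_value, r, s


if __name__ == "__main__":
    for n in (100, 400, 500, 1000):
        print(n, table_row(n, 0.95))
```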


The approximate confidence interval length is obtained from a formula
in Juritz, Juritz and Stephens (1983). It is clear that the bias diminishes
with sample size; since the mean of several subsample estimates has the same
bias as a single subsample estimate, it is better to use one larger sample
for a point estimate.

Confidence  intervals  for  percentage  points. 

Although in practice point estimates of percentage points are nearly
always all that are given when tables are produced, it is useful to examine
confidence intervals for the points, as a guide to their accuracy. In Table
2 the multi-sample and single-sample confidence intervals are compared for
the 95% point of a standard normal distribution, with true value 1.645. The
intervals are obtained from 6 Monte Carlo tests, for each of two comparisons.
In the left-hand half of Table 2 the comparison is between c = 10 runs each
of size m = 100 (the multi-sample case) and one run of size 1000 (the
single-sample case). In the right-hand half of the table, the multi-sample
case has c = 10 and m = 500, to compare with the single-sample n = 5000.

Consider the left-hand part of Table 2. For each of the 6 tests, in the
multi-sample case, the estimate is $\bar{X}_p$, the mean of the 10 values of
$X_p$, where $X_p$ is $X_{(i)}$ with i = 96. The standard deviation $\sigma_p$
of $X_{(i)}$ is 0.210, found from the formula in David (1970, p. 65). This is
to be compared with $S_p$, the standard deviation of the 10 values of
$X_{(i)}$. For each test, the 95% confidence limits for $\xi_p$ have been
found, and the length $L_m$ of the confidence interval recorded.

In the single-sample case, the estimate of $\xi_p$ is $X_{(i)}$ with i = 951.
For each test, the confidence limits have been found from $X_{(r)}$ and
$X_{(s)}$, with r, s as given above, and the length $L_s$ recorded.

From the top half of the table, it can be seen that the experimental
values $S_p$ cluster around the theoretical value $\sigma_p$ = 0.210. More
importantly, the confidence intervals from the top and bottom parts of the
table are approximately the same, as was shown by Juritz, Juritz and Stephens
(1983), but the bias in the one large run is much less than in the
multi-sample case. These conclusions are supported by the right-hand half of
the table, where, in the top part, i = 476 and in the bottom part,
$X_p = X_{(i)}$ with i = 4751. Thus, the studies demonstrate conclusively the
advantage of the single sample over the multi-sample method. Dudewicz and
van der Meulen (1984) discuss confidence intervals further.

Estimates obtained from approximations using moments.

We  now  turn  to  another  idea;  would  it  perhaps  be  better  to  approximate 
percentiles  by  calculating  the  sample  moments  from  the  1000  (say)  values  in 
the  large  Monte  Carlo  run,  and  fitting  a  suitable  curve  to  the  moments  to 
approximate  the  distribution?  Several  curve-fitting  families  are  useful  for 
such  a  purpose,  especially  when,  as  for  many  statistics,  the  distribution 
required  can  be  expected  to  be  smooth.  For  many  goodness-of-fit  statistics, 
the values are usually required in the longer tail (usually the upper
tail); for such purposes, Pearson curves using 4 moments to make the fit have
been  found  to  be  very  useful.  It  should  be  emphasized  that  this  has  been  the 
case  when  theoretical  (that  is,  exact)  moments  could  be  calculated.  Here  we 
propose  to  experiment  with  sample  moments,  based  on  large  samples.  When 
also,  as  often  happens,  the  lower  end  point  of  the  distribution  is  known 


(often  it  is  zero),  a  3-moment  Pearson  curve  fit  can  be  tried,  or  a  3-moment 


fit  of  the  form  (c*^)  .  Since  higher  sample  moments  have  notoriously  high 
sampling  variability,  there  is  something  to  be  said  for  approximations  using 


only  3  moments. 


We  have  recently  explored  the  curve-fitting  possibility,  as  opposed  to 
direct  estimation  of  the  percentile  from  the  Monte  Carlo  sample,  by  again 
taking  samples  from  distributions  for  which  the  exact  percentiles  are  known. 


Example 1. The Weibull distribution. The first illustration is for a
variable x which has a Weibull distribution with shape parameter 2, that
is, $x^2$ has the standard exponential distribution. Thus, the exact value of
$\xi_p = [-\log(1-p)]^{1/2}$. The steps in the curve-fitting technique are:

(a) Take a sample of size n (say 1000) from the Weibull distribution, and
estimate, say, $\xi_p$ by $X_{(k)}$, where k = [np]+1.

(b) Also, calculate the first four sample moments $M'_r$, r = 1,...,4, where
$M'_r = \sum_i x_i^r / n$.

(c) Fit either a 4-moment or a 3-moment Pearson curve (knowing the lower end
of the distribution is zero) and find the estimate $Y_p$ (4-moment fit) or
$Z_p$ (3-moment fit) of $\xi_p$.

(d) Repeat steps (a), (b) and (c) 50 times, to obtain 50 estimates by each
method.

(e) Calculate the average, the variance, and the mean square error (MSE) of
the 50 estimates; from the average and the known value, we can estimate the
bias.

The experiment can be repeated for different sample sizes; we used
n = 100, 200, 500 and 1000. These sample sizes are small compared with those
commonly used in Monte Carlo studies, but the trend of the results can easily
be seen.
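
The bookkeeping of steps (a), (b), (d) and (e) can be sketched as follows. This is a partial illustration only: the Pearson curve fit of step (c) is deliberately omitted, since it needs a Pearson-system fitting routine that is not reproduced here, and the sample moments are simply collected so that such a fit could be plugged in. The variance is computed with divisor equal to the number of runs, an assumption under which MSE = variance + bias squared holds exactly. All function names are illustrative.

```python
import math
import random


def weibull2_sample(n, rng):
    """Weibull sample with shape 2: x**2 is standard exponential."""
    return [math.sqrt(rng.expovariate(1.0)) for _ in range(n)]


def raw_moments(sample, order=4):
    """First `order` sample moments about the origin, M'_r = sum(x**r)/n."""
    n = len(sample)
    return [sum(x ** r for x in sample) / n for r in range(1, order + 1)]


def summarize(estimates, true_value):
    """Return (average, variance, bias, MSE) of a list of estimates."""
    count = len(estimates)
    avg = sum(estimates) / count
    var = sum((e - avg) ** 2 for e in estimates) / count  # divisor = count (assumed)
    bias = avg - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / count
    return avg, var, bias, mse


def experiment(n=100, p=0.90, reps=50, seed=0):
    """Steps (a), (b), (d), (e) for the direct Monte Carlo estimate X_(k)."""
    rng = random.Random(seed)
    true_value = math.sqrt(-math.log(1.0 - p))
    k = math.floor(n * p + 1e-9) + 1
    estimates, moments = [], []
    for _ in range(reps):
        sample = weibull2_sample(n, rng)           # step (a)
        estimates.append(sorted(sample)[k - 1])    # X_(k)
        moments.append(raw_moments(sample))        # step (b); input to step (c)
    return summarize(estimates, true_value), moments


if __name__ == "__main__":
    (avg, var, bias, mse), _ = experiment()
    print(avg, var, bias, mse)
```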

Table 3 gives a comparison of points all along the distribution, for
both Pearson curve fits, for n = 100 and n = 1000. Type 1 refers to the
4-moment fit, and Type 2 to the 3-moment-and-lower-end-point fit.

Comments  on  Table  3. 

(a)  The  Pearson  curve  fits  often  give  somewhat  greater  bias  to  the 
estimate,  but  there  is  a  smaller  variance,  so  that  the  MSE  of  the  Pearson 
curve  estimate  is  better  than  for  the  Monte  Carlo  estimate. 

(b) As expected, the 3-moment fit does better in the lower tail; but it
is only very slightly worse in the upper tail; the marginal difference
suggests that the 3-moment fit is to be preferred.

(c) There is a marked improvement in MSE as n gets larger in both
methods, as one would expect; however, the MSE for the Pearson curve fits
remains somewhat smaller than that for straight Monte Carlo estimation as
n increases.

Many goodness-of-fit statistics have asymptotic distributions which are
sums of weighted chi-squares (see, e.g., Stephens, 1976, 1977, 1979), so our
next example is a comparison for such a distribution. Statistic X has the
distribution $X = .3z_1 + .2z_2 + .2z_3 + .1z_4 + .1z_5 + .1z_6$, where the
$z_i$ are independent $\chi^2_1$ variables. Table 4 gives a comparison of
Monte Carlo and Pearson curve points as before, and again Pearson curves
perform well in estimating points, measured by the MSE.
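
A direct Monte Carlo estimate for this weighted sum is easy to sketch. The code below assumes that each z is a chi-square variable with one degree of freedom, generated as the square of a standard normal variate; the function names and the sample size are illustrative, and only the direct estimate (not the Pearson curve fit) is shown.

```python
import math
import random

WEIGHTS = (0.3, 0.2, 0.2, 0.1, 0.1, 0.1)


def draw_weighted_chisq(rng):
    """One value of X = sum of w_i * z_i, each z_i chi-square with 1 d.f. (assumed)."""
    return sum(w * rng.gauss(0.0, 1.0) ** 2 for w in WEIGHTS)


def monte_carlo_percentile(n, p, rng):
    """Direct estimate of the p-th percentile: the ([np]+1)-th order statistic."""
    xs = sorted(draw_weighted_chisq(rng) for _ in range(n))
    k = math.floor(n * p + 1e-9) + 1
    return xs[k - 1]


if __name__ == "__main__":
    rng = random.Random(3)
    for level in (0.10, 0.50, 0.90, 0.99):
        print(level, round(monte_carlo_percentile(50000, level, rng), 3))
```

The printed values can be set beside the exact percentage points quoted in Table 4.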

Implications  for  the  bootstrap. 

There  are  some  interesting  possible  implications  from  this  result.  The 
bootstrap  is  now  a  very  popular  method  for  deducing  properties  of  a 
statistic;  the  statistic  is  calculated  many  times  over,  by  resampling  from 
one  sample.  In  its  simplest  form,  the  empirical  distribution  function  (EDF) 
of  the  sample  is  used  to  estimate  the  population  distribution  and  samples  are 
then  drawn  from  the  estimate.  It  might,  in  some  circumstances,  be  better  to 
approximate  the  parent  population  by  a  smooth  curve,  such  as  a  Pearson  curve, 
fitted  to  the  sample  moments,  and  then  to  draw  samples  from  the  Pearson  curve 
distribution  when  estimating  properties  of  the  relevant  statistic  by  Monte 
Carlo  methods. 


3.  SIMULATION  OF  A  GAUSSIAN  PROCESS 
It is extremely useful, when finding percentage points for test
statistics by Monte Carlo, to calculate the asymptotic points, instead of
estimating them as described at the end of the previous section; then when
the percentage points at level p are plotted against 1/n or $1/\sqrt{n}$, the
curve is "anchored" at 1/n = 0. If the curve can then be drawn with
confidence, one can deduce percentage points for quite large samples without
incurring the expense of large-sample Monte Carlo studies. Many
goodness-of-fit statistics are functionals of a process which is
asymptotically a Gaussian process; such a process, based on the EDF, is often
tied down at 0 and at 1, and is referred to as a Brownian bridge. Thus we
wish to simulate a Brownian bridge, a Gaussian process Z(t) with Z(0) = 0
and Z(1) = 0, and with known mean (often zero) and known covariance
$\rho(s,t)$. For example, the Kolmogorov statistic D is the supremum of
|Z(t)|, and the Cramér-von Mises statistic $W^2$ is $\int_0^1 Z^2(t)\,dt$.

Monte Carlo simulation of the process Z(t) is difficult, and leads to
further difficulties in finding the asymptotic distribution of D or $W^2$.
Of necessity, on a computer, Z(t) must be discretized. One way to construct
a discrete approximation to Z(t) is as follows:

(a) Choose values $t_1, \ldots, t_k$ equally spaced between 0 and 1.

(b) Generate $u_i$, a standard normal variate, at $t_i$, i = 1,...,k. Let
u' = vector $(u_1, u_2, \ldots, u_k)$.

(c) Create V, a k x k matrix with entries $v_{ij} = \rho(t_i, t_j)$, where
$\rho(s,t)$ is the covariance function of the Gaussian process Z(t). Suppose
W is the square root matrix of V, that is, $W = V^{1/2}$. Since V is positive
definite this is easily obtained. Suppose $V = P \Lambda P'$ where P is
orthogonal and $\Lambda$ is diagonal, with elements on the main diagonal equal
to $\lambda_1, \ldots, \lambda_k$. Then $W = P \Lambda^{1/2} P'$, where
$\Lambda^{1/2}$ is diagonal with elements $\sqrt{\lambda_i}$, i = 1,...,k.

(d) Let z' be vector $(z_1, \ldots, z_k)$, given by z = Wu.

(e) Then let $Z(t_i)$, the estimate of Z(t), be $z_i$. The mean E(z) = 0, and
the covariance E(zz') = E(Wuu'W') = V. Thus the covariance
$E(Z(t_i)Z(t_j)) = \rho(t_i, t_j)$ and the values $Z(t_i)$, i = 1,...,k give a
discrete k-variate approximation to the continuous Z(t). Note also that there
can be other matrices W such that WW' = V, so that various approximations are
possible.

Suppose the number of points k is called the order of the
approximation.
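
Steps (a) to (e) can be sketched in a few lines. Two choices in this sketch are assumptions rather than part of the description above: the covariance is taken to be that of the standard Brownian bridge, min(s,t) - st (the case of a fully specified null distribution), and the factor W is the lower-triangular Cholesky factor rather than the symmetric square root, which is permissible because any W with WW' = V serves. The function names are illustrative.

```python
import math
import random


def bridge_cov(s, t):
    """Covariance of the standard Brownian bridge (assumed for this sketch)."""
    return min(s, t) - s * t


def cholesky(v):
    """Lower-triangular L with L L' = v, for a positive definite matrix v."""
    k = len(v)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            acc = v[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def simulate_bridge(k, rng):
    """Return (t, z): k interior grid points and one simulated path on them."""
    t = [i / (k + 1) for i in range(1, k + 1)]                   # step (a)
    u = [rng.gauss(0.0, 1.0) for _ in range(k)]                  # step (b)
    v = [[bridge_cov(a, b) for b in t] for a in t]               # step (c)
    low = cholesky(v)
    z = [sum(low[i][j] * u[j] for j in range(i + 1)) for i in range(k)]  # step (d)
    return t, z


def statistics_from_path(z):
    """Discrete approximations to D = sup|Z(t)| and W^2 = integral of Z^2."""
    d = max(abs(val) for val in z)
    w2 = sum(val * val for val in z) / (len(z) + 1)  # Riemann sum, spacing 1/(k+1)
    return d, w2


if __name__ == "__main__":
    rng = random.Random(11)
    _, path = simulate_bridge(50, rng)
    print(statistics_from_path(path))
```

The further approximation involved in reading D and W^2 off a discrete path is visible in `statistics_from_path`, where the supremum and the integral are replaced by a maximum and a sum over the k grid points.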

Even when an approximation to Z(t) has been created, there are clearly
further approximations involved in calculating D or $W^2$. The above
procedure must be repeated n times, say, to find the distribution of the
statistics. One might then suppose that the percentage points of the
asymptotic distribution of, say, D will be found as the limit of the
smoothed Monte Carlo values plotted against 1/k or $1/\sqrt{k}$, as k becomes
larger. Unfortunately as k becomes larger, the manipulation of the k x k
matrices V and $V^{1/2}$ becomes increasingly prone to numerical errors.
Chandra, Singpurwalla and Stephens (1981) carried out this procedure to
obtain points for D for use in testing for a Weibull distribution with
unknown parameters, and found that as k became larger so that m = 1/k -> 0,
the plot of a typical percentile of D against m was not monotonic. In the
end, it then becomes difficult if not impossible to extrapolate to get
asymptotic percentage points.

Another method exists of constructing an approximation to the process
Z(t). Since $\rho(s,t)$ is positive semi-definite, one can proceed as follows.

(a) Solve the integral equation

$f_i(s) = \lambda_i \int_0^1 \rho(s,t) f_i(t)\,dt$

for eigenvalues $\lambda_i$ and eigenfunctions $f_i(t)$.

(b) Let $u_1, u_2, \ldots, u_k$ be a set of k independent standard normal
variables.

(c) Let $Z_k(t) = \sum_{j=1}^{k} f_j(t) u_j / \sqrt{\lambda_j}$. Then as
k -> infinity, $Z_k(t)$ tends to Z(t), say; Z(t) is a Gaussian process with
mean 0 and covariance $\sum_{i=1}^{\infty} f_i(s) f_i(t) / \lambda_i$, and by
well-known properties of integral equations, this is equal to $\rho(s,t)$, the
kernel of the equation in (a). Thus $Z_k(t)$ can be regarded as a k-th order
approximation to Z(t). Here the approximation arises because in real
calculations the sum must be terminated at a finite $u_k$; then the $Z_k(t)$
can be calculated at any point in the interval (0,1). Once a realization has
been made by the choice of $u_i$, the value of D or $W^2$ can be found
accurately. This approximation appears, therefore, to have some advantage:
the difficulties arise in calculating the $f_i(t)$ and the $\lambda_i$. It
would be interesting to see this technique explored further: a good problem
on which to test it would be that of finding the distribution of D or $W^2$
when parameters in the tested distribution are fully specified: then the
distributions of D and $W^2$ are both exactly known. If the technique is
successful, it could be used to find the distribution of D for cases where
parameters are unknown. The accuracy can then be tested by finding the points
for $W^2$, and comparing with the exact points, which are known for this
statistic; these points are given, for many distributions, in Stephens
(1986a). To find the distribution of D, the Kolmogorov statistic, in cases
where parameters are unknown is a problem of considerable interest, since D
is a well-known statistic for testing fit; although in fact it is often much
less powerful than $W^2$ or the related Anderson-Darling statistic $A^2$
(see, for example, Stephens 1974, 1986a).
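
For the fully specified case suggested above as a test problem, the eigen-solution is available in closed form, so the construction can be sketched directly. The sketch assumes the standard Brownian bridge kernel min(s,t) - st, for which (with the eigenvalue convention of step (a)) the eigenvalues are $j^2\pi^2$ and the eigenfunctions are $\sqrt{2}\sin(j\pi t)$; these are standard facts but are not stated in the text, and the function names are illustrative.

```python
import math
import random


def kl_bridge_path(u):
    """Return the function Z_k(t) built from the normal variates u_1..u_k."""
    def z(t):
        return sum(
            math.sqrt(2.0) * math.sin(j * math.pi * t) * uj / (j * math.pi)
            for j, uj in enumerate(u, start=1)
        )
    return z


def w2_from_coefficients(u):
    """W^2 for Z_k: by orthonormality of the f_j, the integral of Z_k^2
    over (0,1) is sum(u_j**2 / lambda_j), with no grid needed."""
    return sum(uj * uj / (j * math.pi) ** 2 for j, uj in enumerate(u, start=1))


def truncated_variance(t, k):
    """Variance of Z_k(t); tends to t(1-t) as k grows."""
    return sum(
        2.0 * math.sin(j * math.pi * t) ** 2 / (j * math.pi) ** 2
        for j in range(1, k + 1)
    )


if __name__ == "__main__":
    rng = random.Random(7)
    u = [rng.gauss(0.0, 1.0) for _ in range(200)]
    z = kl_bridge_path(u)
    grid = [i / 1000 for i in range(1001)]
    print("D approx:", max(abs(z(t)) for t in grid))
    print("W^2:", w2_from_coefficients(u))
    print("var at 0.5:", truncated_variance(0.5, 200))
```

Here $W^2$ is obtained from the coefficients without discretizing t, while D still requires a search over t; the path can be evaluated on as fine a grid as desired once the $u_j$ are fixed.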


In the above discussion we referred to the method of estimating
asymptotic points of a distribution, as a parameter (k above) tends to
infinity, by plotting points for finite k against 1/k or $1/\sqrt{k}$, and
extrapolating a curve through these points to 1/k = 0. How this
extrapolation should be done is itself a problem. It often arises when
asymptotic points (as sample size n tends to infinity) are required for
making tables, say, and are to be obtained by extrapolating from Monte Carlo
results for finite samples. Suppose Monte Carlo experiments have given
percentage point estimates $\hat{\xi}_{pn}$ for level p and sample size n. It
is valuable to plot $\hat{\xi}_{pn}$ against 1/n, say, and then an
extrapolation to 1/n = 0 should give an estimate of the asymptotic point.
However, how should this be done? It may even be known that the value
$\xi_{pn}$ can be expressed as
$\xi_{pn} = a_0 + a_1 m + a_2 m^2 + a_3 m^3$, where m = 1/n or $1/\sqrt{n}$.
The problem is then how best to estimate $a_0$ from estimates
$\hat{\xi}_{pn}$? We raise the question here because this appears to be an
important practical problem in preparing tables of points, one which appears
to need further examination. Knowledge of the values of $a_1, a_2, \ldots$
above would also be helpful to derive modified forms of test statistics, for
example, of D or $W^2$, such as are used in some tables in Stephens (1986a).
Such forms have the merit of drastically reducing the size of tables, and of
making computerization of tables much easier.
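
One simple possibility, offered only as an illustration and not as an answer to the question raised above, is an ordinary least-squares fit of a low-order polynomial in m, with the intercept taken as the estimate of $a_0$. The sketch ignores the fact that the Monte Carlo estimates at different n have different variances; the function names, the default degree and the choice of m are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_polynomial(m_values, y_values, degree):
    """Least-squares coefficients a_0..a_degree via the normal equations."""
    size = degree + 1
    gram = [[sum(m ** (i + j) for m in m_values) for j in range(size)]
            for i in range(size)]
    rhs = [sum(y * m ** i for m, y in zip(m_values, y_values))
           for i in range(size)]
    return solve_linear(gram, rhs)


def extrapolate_asymptotic(ns, points, degree=2, power=1.0):
    """Estimate a_0 from points at sample sizes ns, with m = n**(-power).

    power=1.0 gives m = 1/n; power=0.5 gives m = 1/sqrt(n).
    """
    m_values = [n ** (-power) for n in ns]
    return fit_polynomial(m_values, points, degree)[0]


if __name__ == "__main__":
    ns = [10, 20, 50, 100, 200]
    # Synthetic points from a known curve, used only to exercise the code.
    pts = [1.358 + 0.5 / n - 2.0 / n ** 2 for n in ns]
    print(extrapolate_asymptotic(ns, pts))
```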


4.  APPROXIMATING  THE  INVERSE  OF  A  COVARIANCE 
MATRIX  OF  ORDER  STATISTICS 

For some techniques of testing fit, based ultimately on the idea of
probability plots, one needs $V^{-1}$, the inverse of V, where V is the
covariance matrix of the order statistics $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$
of a sample from a completely specified distribution. For a review of such
tests see Stephens (1986b). The most notable example of the use of $V^{-1}$
is with the Shapiro-Wilk test for normality, where the order statistics come
from the normal distribution with mean 0 and variance 1.

When X has the uniform distribution with limits 0 and 1, the covariance
matrix Q with entries $q_{ij}$ = covariance($X_{(i)}, X_{(j)}$) is given by
$q_{ij} = (i/(n+1))\{1 - j/(n+1)\}/(n+2)$, $1 \le i \le j \le n$. The other
entries, for i > j, are obtained from the symmetry of Q. More generally, if
X has


For the normal case, Davis and Stephens (1977) used various identities
to give a good approximation to V, and this can be inverted to give $V^{-1}$.
Can similar identities be used to give accurate approximations to V for
other distributions, so that V can then be inverted to give $V^{-1}$
accurately? Even if such identities were available, this technique seems a
rather indirect way to approximate $V^{-1}$; can an accurate approximation for
$V^{-1}$ be found more directly? These are useful questions to answer not only
in connection with tests of fit, but also for estimating parameters using
linear combinations of order statistics.
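
The uniform case gives a convenient check on any routine for $V^{-1}$, because there the inverse has a simple closed form. The sketch below builds Q from the formula for $q_{ij}$ and compares it with the tridiagonal matrix (n+1)(n+2) times tridiag(-1, 2, -1); that closed form is a standard result assumed here, not one stated in the text, and the function names are illustrative.

```python
def uniform_order_cov(n):
    """Covariance matrix Q of uniform(0,1) order statistics, from q_ij."""
    q = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            val = (i / (n + 1)) * (1.0 - j / (n + 1)) / (n + 2)
            q[i - 1][j - 1] = val
            q[j - 1][i - 1] = val  # symmetry of Q
    return q


def uniform_order_cov_inverse(n):
    """Closed-form inverse (assumed): (n+1)(n+2) * tridiag(-1, 2, -1)."""
    c = float((n + 1) * (n + 2))
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 2.0 * c
        if i + 1 < n:
            inv[i][i + 1] = -c
            inv[i + 1][i] = -c
    return inv


def max_identity_error(a, b):
    """Largest absolute deviation of the product a b from the identity."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(a[i][m] * b[m][j] for m in range(n))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst


if __name__ == "__main__":
    n = 6
    print(max_identity_error(uniform_order_cov(n), uniform_order_cov_inverse(n)))
```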

Table 1

Expected values of order statistics from the exponential distribution,
used in estimates and confidence intervals for $\xi_p$.

Columns: n; k; m_k; bias x 10^4; r; m_r; s; m_s; L_s, the exact expected
C.I. length; expected C.I. length (approx.).

p = .95; True $\xi_p$ = 2.9957

      n       k     m_k    bias       r     m_r       s     m_s   exact  approx.
    100      96  3.1040    1083      92  2.4695     100  5.1874   2.718   1.802
    400     381  3.0222     265     372  2.6428     390  3.6410    .998    .866
    500     476  3.0169     212     466  2.6746     486  3.5413    .867    .772
   1000     951  3.0063     106     937  2.7572     965  3.3387    .582    .543
   2000    1901  3.0010      53    1882  2.8262    1920  3.2129    .387    .383
   5000    4751  2.9978      21    4721  2.8843    4781  3.1259    .242    .242
  10000    9501  2.9968      11    9458  2.9142    9544  3.0868    .173    .171
 100000   95001  2.9958       1   94866  2.9692   95136  3.0232    .054    .054

p = .975; True $\xi_p$ = 3.6889

      n       k     m_k    bias       r     m_r       s     m_s   exact  approx.
    100      98  3.6874     -15      95  2.9040       -       -       -       -
    400     391  3.7410     521     385  3.2517     397  4.7366   1.485   1.221
    500     488  3.6896       7     481  3.2451     495  4.5095   1.264   1.093
   1000     976  3.7095     206     966  3.3673     986  4.2339    .867    .773
   2000    1951  3.6992     103    1937  3.4501    1965  4.0316    .582    .547
   5000    4876  3.6930      41    4854  3.5303    4898  3.8874    .357    .346
  10000    9751  3.6909      20    9720  3.5738    9782  3.8236    .250    .248
 100000   97501  3.6891       2   97404  3.6510   97598  3.7287    .078    .077


Table 2

Estimates for the 95% point of a standard normal distribution
(true value = 1.645).

Average Bias x l(r - 1.8: Expected bias - 4.1 Average Bias x 10 * 9: Expected bias


Table 3

Comparison of estimates of percentage points obtained from (a) direct Monte
Carlo estimates and (b) Pearson curve fits. The true distribution is Weibull
with shape parameter 2. The results are based on 50 Monte Carlo runs, each
with sample size n = 100 or n = 1000. The alpha-levels are measured from
the lower tail.

A = Average; V = Variance; B = Bias; M = M.S.E.

Part 1. n = 100.

       Monte Carlo   Pearson Curve Type 1   Pearson Curve Type 2

Alpha level = 0.10   Exact Perc. Pt. = 0.324593
A        0.33730           0.32783                0.32510
V        0.00363           0.00221                0.00218
B        0.01271           0.00323                0.00050
M        0.00379           0.00223                0.00218

Alpha level = 0.50   Exact Perc. Pt. = 0.832555
A        0.82924           0.81905                0.82088
V        0.00411           0.00328                0.00307
B       -0.00331          -0.01350               -0.01167
M        0.00412           0.00346                0.00320

Alpha level = 0.90   Exact Perc. Pt. = 1.517427
A        1.52954           1.50708                1.50560
V        0.01196           0.00777                0.00675
B        0.01211          -0.01035               -0.01182
M        0.01211           0.00788                0.00689

Alpha level = 0.99   Exact Perc. Pt. = 2.145966
A        2.21615           2.07182                2.07348
V        0.05308           0.02907                0.03337
B        0.07018          -0.07415               -0.07248
M        0.05801           0.03457                0.03862

Part 2. n = 1000.

       Monte Carlo   Pearson Curve Type 1   Pearson Curve Type 2

Alpha level = 0.10   Exact Perc. Pt. = 0.324593
A        0.32454           0.32768                0.32726
V        0.00033           0.00025                0.00024
B       -0.00005           0.00309                0.00267
M        0.00033           0.00026                0.00025

Alpha level = 0.50   Exact Perc. Pt. = 0.832555
A        0.83302           0.83106                0.82760
V        0.00031           0.00028                0.00027
B        0.00046          -0.00149               -0.00495
M        0.00031           0.00028                0.00030

Alpha level = 0.90   Exact Perc. Pt. = 1.517427
A        1.52723           1.52818                1.53567
V        0.00090           0.00066                0.00052
B        0.00980           0.01075                0.01824
M        0.00100           0.00078                0.00085

Alpha level = 0.99   Exact Perc. Pt. = 2.145966
A        2.15864           2.15196                2.14043
V        0.00506           0.00241                0.00260
B        0.01268           0.00600               -0.00554
M        0.00523           0.00245                0.00263

Table 4

Comparison of estimates of percentage points (see Table 3). True
distribution: sum of weighted chi-squares.

A = Average; V = Variance; B = Bias; M = M.S.E.

Part 1. n = 100.

       Monte Carlo   Pearson Curve Type 1   Pearson Curve Type 2

Alpha level = 0.10   Exact Perc. Pt. = 0.342
A        0.34954           0.36148                0.32372
V        0.00170           0.00196                0.00154
B        0.00754           0.01948               -0.01828
M        0.00176           0.00234                0.00188

Alpha level = 0.50   Exact Perc. Pt. = 0.862
A        0.86908           0.83069                0.86897
V        0.00382           0.00426                0.00298
B        0.00708          -0.03131                0.00697
M        0.00388           0.00524                0.00303

Alpha level = 0.90   Exact Perc. Pt. = 1.831
A        1.90994           1.90516                1.86493
V        0.04637           0.02837                0.02400
B        0.07894           0.07416                0.03393
M        0.05260           0.03387                0.02515

Alpha level = 0.99   Exact Perc. Pt. = 3.087
A        3.59271           3.12139                3.11789
V        0.65287           0.19780                0.16248
B        0.50571           0.03439                0.03089
M        0.90861           0.19898                0.16343

Part 2. n = 1000.

       Monte Carlo   Pearson Curve Type 1   Pearson Curve Type 2

Alpha level = 0.10   Exact Perc. Pt. = 0.342
A        0.34527           0.34522                0.33629
V        0.00018           0.00016                0.00022
B        0.00327           0.00322               -0.00571
M        0.00019           0.00017                0.00025

Alpha level = 0.50   Exact Perc. Pt. = 0.862
A        0.86873           0.86074                0.87275
V        0.00057           0.00057                0.00044
B        0.00673          -0.00126                0.01075
M        0.00062           0.00057                0.00056

Alpha level = 0.90   Exact Perc. Pt. = 1.831
A        1.85648           1.86264                1.84780
V        0.00325           0.00241                0.00217
B        0.02548           0.03164                0.01680
M        0.00390           0.00341                0.00245

Alpha level = 0.99   Exact Perc. Pt. = 3.087
A        3.12492           3.08379                3.08268
V        0.02997           0.01909                0.01747
B        0.03792          -0.00321               -0.00432
M        0.03141           0.01910                0.01749